Clique (graph theory)

In the mathematical area of graph theory, a clique in an undirected graph is a subset of its vertices such that every two vertices in the subset are connected by an edge. Cliques are one of the basic concepts of graph theory and are used in many other mathematical problems and constructions on graphs. Cliques have also been studied in computer science: finding whether there is a clique of a given size in a graph (the clique problem) is NP-complete, but despite this hardness result many algorithms for finding cliques have been studied.

Although the study of complete subgraphs goes back at least to the graph-theoretic reformulation of Ramsey theory by Erdős & Szekeres (1935),[1] the term "clique" comes from Luce & Perry (1949), who used complete subgraphs in social networks to model cliques of people; that is, groups of people all of whom know each other. Cliques have many other applications in the sciences and particularly in bioinformatics.

Contents

Definitions

A clique in an undirected graph G = (VE) is a subset of the vertex set C ⊆ V, such that for every two vertices in C, there exists an edge connecting the two. This is equivalent to saying that the subgraph induced by C is complete (in some cases, the term clique may also refer to the subgraph).

A maximal clique is a clique that cannot be extended by including one more adjacent vertex, that is, a clique which does not exist exclusively within the vertex set of a larger clique.

A maximum clique is a clique of the largest possible size in a given graph. The clique number ω(G) of a graph G is the number of vertices in a maximum clique in G. The intersection number of G is the smallest number of cliques that together cover all edges of G.

The opposite of a clique is an independent set, in the sense that every clique corresponds to an independent set in the complement graph. The clique cover problem concerns finding as few cliques as possible that include every vertex in the graph. A related concept is a biclique, a complete bipartite subgraph. The bipartite dimension of a graph is the minimum number of bicliques needed to cover all the edges of the graph.

Mathematics

Mathematical results concerning cliques include the following.

Several important classes of graphs may be defined by their cliques:

Additionally, many other mathematical constructions involve cliques in graphs. Among them,

Closely related concepts to complete subgraphs are subdivisions of complete graphs and complete graph minors. In particular, Kuratowski's theorem and Wagner's theorem characterize planar graphs by forbidden complete and complete bipartite subdivisions and minors, respectively.

Computer science

In computer science, the clique problem is the computational problem of finding a maximum clique, or all cliques, in a given graph. It is NP-complete, one of Karp's 21 NP-complete problems (Karp 1972). It is also fixed-parameter intractable, and hard to approximate. Nevertheless, many algorithms for computing cliques have been developed, either running in exponential time (such as the Bron–Kerbosch algorithm) or specialized to graph families such as planar graphs or perfect graphs for which the problem can be solved in polynomial time.

Applications

The word "clique", in its graph-theoretic usage, arose from the work of Luce & Perry (1949), who used complete subgraphs to model cliques (groups of people who all know each other) in social networks. For continued efforts to model social cliques graph-theoretically, see e.g. Alba (1973), Peay (1974), and Doreian & Woodard (1994).

Many different problems from bioinformatics have been modeled using cliques. For instance, Ben-Dor, Shamir & Yakhini (1999) model the problem of clustering gene expression data as one of finding the minimum number of changes needed to transform a graph describing the data into a graph formed as the disjoint union of cliques; Tanay & Sharan (Shamir) discuss a similar biclustering problem for expression data in which the clusters are required be cliques. Sugihara (1984) uses cliques to model ecological niches in food webs. Day & Sankoff (1986) describe the problem of inferring evolutionary trees as one of finding maximum cliques in a graph that has as its vertices characteristics of the species, where two vertices share an edge if there exists a perfect phylogeny combining those two characters. Samudrala & Moult (1998) model protein structure prediction as a problem of finding cliques in a graph whose vertices represent positions of subunits of the protein. And by searching for cliques in a protein-protein interaction network, Spirin & Mirny (2003) found clusters of proteins that interact closely with each other and have few interactions with proteins outside the cluster. Power graph analysis is a method for simplifying complex biological networks by finding cliques and related structures in these networks.

In electrical engineering, Prihar (1956) uses cliques to analyze communications networks, and Paull & Unger (1959) use them to design efficient circuits for computing partially-specified Boolean functions. Cliques have also been used in automatic test pattern generation: a large clique in an incompatibility graph of possible faults provides a lower bound on the size of a test set.[3] Cong & Smith (1993) describe an application of cliques in finding a hierarchical partition of an electronic circuit into smaller subunits.

In chemistry, Rhodes et al. (2003) use cliques to describe chemicals in a chemical database that have a high degree of similarity with a target structure. Kuhl, Crippen & Friesen (1983) use cliques to model the positions in which two chemicals will bind to each other.

Notes

  1. ^ The earlier work by Kuratowski (1930) characterizing planar graphs by forbidden complete and complete bipartite subgraphs was originally phrased in topological rather than graph-theoretic terms.
  2. ^ Barthélemy, Leclerc & Monjardet (1986), page 200.
  3. ^ Hamzaoglu & Patel (1998).

References

External links